Abstract

The latest global financial tsunami and the global economic recession that followed have uncovered the crucial
impact of housing markets on financial and economic systems. The Chinese stock market experienced a
marked fall during the global financial tsunami, and China’s economy also slowed down by about
2%-3% when measured in GDP. Nevertheless, the housing markets in diverse Chinese cities seemed to
continue their almost nonstop mania for more than ten years. However, the structure and dynamics of
the Chinese housing market are less studied. Here we perform an extensive study of the Chinese housing 
market by analyzing ten representative key cities based on both linear and nonlinear econophysical and 
econometric methods. We identify a common collective driving force which accounts for 96.5% of the house 
price growth, indicating very high systemic risk in the Chinese housing market. The ten key cities can be 
categorized into clubs and the house prices of the cities in the same club exhibit an evident convergence. 
These findings from different methods are basically consistent with each other. The identified city clubs 
are also consistent with the conventional classification of city tiers. The house prices of the first-tier 
cities grow the fastest, and those of the third- and fourth-tier cities rise the slowest, which illustrates the 
possible presence of a ripple effect in the diffusion of house prices in different cities. 


Introduction 

The U.S. housing market rose continuously from the 1990s, driven by diverse
factors such as the wealth effect and the inflow of international capital [1]. According to the log-periodic
power-law model [2,3], no bubble was detected in the US housing market in 2003 [4]. However, in 2005,
evident signatures of a housing bubble were identified [5], and a strikingly accurate forecast was released
in the Abstract of Ref. [5], stating that: “From the analysis of the S&P 500 Home Index, we conclude
that the turning point of the bubble will probably occur around mid-2006.” The predicted downturn of US
house prices, measured by the S&P Case-Shiller house price index, indeed occurred in 2006 and
triggered the outbreak of the US subprime mortgage crisis in 2007. The aftermath was very severe. It
caused a credit crisis in the US and a national crisis in the US financial markets. The US financial crisis
spread to financial markets worldwide and hastened a global financial crisis in 2008. In this case,
stock markets acted perfectly as the barometer of real economies, and a global economic recession followed
unavoidably. What followed further was the European sovereign debt crisis. Economies worldwide
are still struggling to recover. The outline of this story demonstrates the crucial role played
by an economy’s housing market.

In the past three decades, China’s economy experienced unprecedented growth with an average
growth rate of about 10%, and the capitalization of the Chinese stock market has become one of the
largest in the world. During the global financial tsunami, the Chinese stock market bubble burst
and the market dropped by more than 70%, with the Shanghai Stock Exchange Composite Index plummeting from its
historical high of 6124 on 16 October 2007 to 1664 on 28 October 2008 [6], and China’s economy
also slowed down by about 2%-3% when measured in GDP. Nevertheless, the housing markets in diverse
Chinese cities seemed to continue their almost nonstop mania for more than ten years. In late 2003, the
Shanghai House Price Composite Index already exhibited unmistakable signatures of a bubble in the making [7]. In
early 2008, the Shanghai housing market dropped mildly and then continued to soar, which was partially
fuelled by the government stimulus package of 4 trillion Chinese yuan announced in November 2008.

There are still debates on whether there is a housing bubble in China. The consensus leans
toward the presence of a bubble, and many people think or hope that the bubble will burst sooner or later,
although many others, including some officials and house builders, deny the possibility of a burst.
Nevertheless, the possibility that the housing market will crash nationwide is a sword
hanging over the development of China’s economy. A crash of the housing market would cause severe damage
to the economy and could even cause social problems.

It has been well recognized that studying complex economic and financial systems under the framework 
of complex networks has crucial scientific significance, because the units or agents in complex systems 
interact with each other in a nonlinear manner [8]. In this work, we will investigate the correlation 
structure of house price indexes of 10 key cities in China based on both linear and nonlinear econophysical 
and econometric methods, which is closely related to the systemic risk of the national housing market 
[9-12]. We identify city clubs and club convergence in the house price indexes and high systemic risk in
the national housing market. 


Materials and Methods 

Data sets 

We use the monthly house price composite index (HPI) data for 10 key cities of China covering the
period from January 2005 to November 2013, which were retrieved from the China Real Estate Index System
(CREIS) of the China Index Academy. The data are publicly available at
http://fdc.fang.com/index/XinEangIndex.aspx. The 10 key cities are Beijing (BJ), Shanghai (SH),
Guangzhou (GZ), Shenzhen (SZ), Tianjin (TJ), Wuhan (WH), Chongqing (CQ), Nanjing (NJ), Hangzhou (HZ)
and Chengdu (CD). The house price indexes are constructed as the Urban Comprehensive Index, which takes
residential houses, office buildings and commercial shops into account. The HPIs of these key cities are
regarded as the weather vane of China’s real estate market. Figure 1 illustrates the HPI time series
$y_i(t)$ ($t = 1, 2, \ldots, T$) of city $i$. Because the data of the first 6 months are unavailable for
Wuhan, Hangzhou and Chengdu, we investigate the HPIs since July 2005, which gives 101 data points for
each city.

Figure 1. Evolution of the Urban Comprehensive Index of 10 key cities in China. All the
indexes have risen during the time period under investigation.

Correlation matrix and random matrix theory

Random matrix theory (RMT) has long been applied in the econophysics community [13-17]. The
similarity between two time series $y_i(t)$ and $y_j(t)$ is commonly calculated by the Pearson correlation
coefficient as follows:

$$C_{ij} = \frac{\langle [y_i(t) - \langle y_i \rangle][y_j(t) - \langle y_j \rangle] \rangle}{\sigma_i \sigma_j}, \tag{1}$$

where $\langle \cdot \rangle$ denotes the mean over time and $\sigma_i$ is the standard deviation of time
series $y_i(t)$. We study the raw correlation matrix $C$, whose elements $C_{ij}$ are the Pearson
correlation coefficients between pairs of time series $y_i$ and $y_j$.
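As an illustration of Eq. (1), the following minimal sketch computes the correlation matrix of a panel stored as an $N \times T$ array. The panel used here is synthetic (a shared linear trend plus noise) and is only a stand-in for the HPI data; the function name `correlation_matrix` is hypothetical.

```python
import numpy as np

def correlation_matrix(y):
    """Pearson correlation matrix of the rows of y (shape N x T), following Eq. (1)."""
    y = np.asarray(y, dtype=float)
    centered = y - y.mean(axis=1, keepdims=True)   # y_i(t) - <y_i>
    sigma = y.std(axis=1)                          # standard deviation of each series
    cov = centered @ centered.T / y.shape[1]       # <(y_i - <y_i>)(y_j - <y_j>)>
    return cov / np.outer(sigma, sigma)

# Synthetic stand-in for the HPI panel: 10 series sharing one upward trend.
rng = np.random.default_rng(0)
T, N = 101, 10
trend = np.linspace(1000.0, 2000.0, T)
y = trend + rng.normal(scale=30.0, size=(N, T))

C = correlation_matrix(y)
```

Because all synthetic series share the same trend, the off-diagonal elements of `C` come out close to 1, which mimics the very high raw correlations discussed for the HPI series.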

The eigenvalues and eigenvectors of $C$ provide important information. In terms of principal
component analysis, for $N$ time series, the eigenvectors $v_i$ ($i = 1, 2, \ldots, N$) of the correlation
matrix $C$ form a full set of orthogonal axes which decompose the total variability of all the time series
into several orthogonal sub-variabilities by projecting observations on the axes. This decomposition
implies that the variability summarized by the first (largest) eigenvalue $\lambda_1$ and its corresponding
eigenvector $v_1$ is the maximum among all possible orthogonal choices of axes; the second largest eigenvalue
$\lambda_2$ and its corresponding eigenvector $v_2$ then summarize the maximum variation in the unexplained
portion of the original series after excluding the information explained by $\lambda_1$, and so on down to
the smallest eigenvalue $\lambda_N$. The percent of variability explained by projecting observations on each
eigenvector $v_i$ can be calculated as follows:

$$\varphi_i = \frac{\lambda_i}{\sum_{k=1}^{N} \lambda_k} = \frac{\lambda_i}{N}. \tag{2}$$

We can also calculate the cumulative percent up to the $i$th eigenvalue as follows:

$$\Phi_i = \frac{\sum_{k=1}^{i} \lambda_k}{\sum_{k=1}^{N} \lambda_k}, \tag{3}$$

which is also called the absorption ratio [5] and is a measure of systemic risk.
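The explained and cumulative percentages of Eqs. (2) and (3) can be sketched in a few lines. The example below uses a toy $3 \times 3$ correlation matrix, not the HPI data, and the function name `explained_variability` is an assumption made for illustration.

```python
import numpy as np

def explained_variability(C):
    """Eigenvalues (descending), eigenvectors, phi_i of Eq. (2) and Phi_i of Eq. (3)."""
    lam, V = np.linalg.eigh(np.asarray(C, dtype=float))   # eigh returns ascending order
    lam, V = lam[::-1], V[:, ::-1]                        # sort from largest to smallest
    phi = lam / lam.sum()                                 # lam.sum() equals N for a correlation matrix
    Phi = np.cumsum(phi)                                  # cumulative share (absorption ratio)
    return lam, V, phi, Phi

# Toy 3 x 3 correlation matrix with a common correlation of 0.9 (illustrative only).
C = np.full((3, 3), 0.9)
np.fill_diagonal(C, 1.0)
lam, V, phi, Phi = explained_variability(C)
```

For this toy matrix the largest eigenvalue is $1 + 2 \times 0.9 = 2.8$, so the first eigenvector alone absorbs $2.8/3$ of the variability, and its components all have the same magnitude, which is the signature of a market-wide mode.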

Applications of RMT to stock markets [16-18] and housing markets [12] show that the largest
eigenvalue $\lambda_1$ and its corresponding eigenvector $v_1$ characterize the collective
response of the entire market to common stimuli. If there is a strong collective behavior in the market,
all components participate almost identically in $v_1$, representing an influence that is common
to all stocks. $\lambda_1$ and its corresponding $\varphi_1$ would then be extremely large, explaining most
of the variability in the observations. One can further unravel some grouping information from the other
large eigenvalues and their associated eigenvectors that deviate from the RMT predictions. Besides, the
smallest eigenvalue $\lambda_N$ and its corresponding eigenvector $v_N$ can highlight pairs of stocks with a
correlation coefficient much larger than the average, namely pairs “decoupling” from other stocks [16-19].

Box clustering method 

The box clustering method has been applied to search for element clusters of the correlation matrix [20].
It first determines the optimal ordering of matrix elements to ensure that the correlation matrix has a
nested block-diagonal structure, where the simulated annealing approach is adopted to minimize the cost
function:

$$Q = \sum_{i,j=1}^{N} C_{ij} |i - j|, \tag{4}$$

where $C_{ij}$ can be an element of the raw or the partial correlation matrix in this work. Then a greedy
algorithm is implemented to partition the time series into clusters. The procedure is repeated $n$ times to
obtain $n$ different partitions of HPI clusters [21]. An affinity matrix $A$ is obtained, whose element
$A_{ij}$ is the number of partitions in which series $y_i(t)$ and $y_j(t)$ are assigned to the same cluster,
divided by the number of partitions $n$. We take a typical number of $n = 1000$ here. Finally we apply the
clustering method to the affinity matrix $A$ itself, resulting in a final partition of the time series.
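Two ingredients of this procedure can be sketched without the full annealing loop: the cost of a given ordering and the affinity matrix built from repeated partitions. The sketch assumes that the cost has the form $\sum_{i,j} C_{ij}|i-j|$ commonly used for box clustering; the simulated annealing search and the greedy partitioning step are not shown, and the function names and toy inputs are hypothetical.

```python
import numpy as np

def ordering_cost(C, order):
    """Cost Q of an ordering: correlations weighted by their distance from the diagonal."""
    C = np.asarray(C, dtype=float)
    P = C[np.ix_(order, order)]                 # matrix re-indexed by the ordering
    idx = np.arange(len(order))
    return float(np.sum(P * np.abs(idx[:, None] - idx[None, :])))

def affinity_matrix(partitions):
    """Fraction of partitions in which each pair of series shares a cluster."""
    labels = np.asarray(partitions)             # shape (n_partitions, N)
    same = labels[:, :, None] == labels[:, None, :]
    return same.mean(axis=0)

# Toy example: two blocks of strongly correlated series, {0, 1} and {2, 3}.
C = np.array([[1.0, 0.9, 0.1, 0.1],
              [0.9, 1.0, 0.1, 0.1],
              [0.1, 0.1, 1.0, 0.9],
              [0.1, 0.1, 0.9, 1.0]])
q_blocked = ordering_cost(C, [0, 1, 2, 3])      # blocks kept contiguous
q_mixed = ordering_cost(C, [0, 2, 1, 3])        # blocks interleaved

# Three hypothetical partitions, given as cluster labels per series.
A = affinity_matrix([[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 1, 1]])
```

The contiguous ordering has the lower cost, which is why minimizing $Q$ pushes strongly correlated series next to each other and reveals blocks along the diagonal.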

Partial correlations 

The concept of partial correlation is a powerful tool to investigate the intrinsic correlation between two
time series affected by common factors [22] and has been applied to stock markets [23-25] and housing
markets [12]. For time series $y_i(t)$, $i = 1, 2, \ldots, N$, with a common collective trend $G(t)$, we can
extract their idiosyncratic components $\varepsilon_i(t)$ by calibrating the following simple univariate
factor model:

$$y_i(t) = \alpha_i + \beta_i G(t) + \varepsilon_i(t). \tag{5}$$

When there is more than one common factor, the above regression can be easily extended to the
multivariate form. The correlation matrix of the $\varepsilon_i(t)$ is the partial correlation matrix $P$ of
the original time series, whose elements $P_{ij}$ depict the residual correlations between $y_i(t)$ and
$y_j(t)$ after removing the impact of the market-wide collective effect $G(t)$, which is the eigenportfolio
of the largest eigenvalue.
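The regression of Eq. (5) and the resulting partial correlation matrix can be sketched as follows. This is a minimal illustration on synthetic data in which $G(t)$ is simply a given linear trend rather than the eigenportfolio; the function name `partial_correlation` is hypothetical.

```python
import numpy as np

def partial_correlation(y, G):
    """Partial correlation matrix of the rows of y after regressing out G(t), Eq. (5)."""
    y = np.asarray(y, dtype=float)                     # shape (N, T)
    G = np.asarray(G, dtype=float)                     # shape (T,)
    X = np.column_stack([np.ones_like(G), G])          # regressors: constant and G(t)
    coef, *_ = np.linalg.lstsq(X, y.T, rcond=None)     # rows: alpha_i, beta_i
    resid = (y.T - X @ coef).T                         # residuals eps_i(t), shape (N, T)
    return np.corrcoef(resid), resid

# Synthetic example: a common trend plus a shock shared only by series 0 and 1.
rng = np.random.default_rng(1)
T = 101
G = np.linspace(1.0, 2.0, T)
shared = rng.normal(size=T)
noise = rng.normal(size=(4, T))
y = np.vstack([
    1.0 + 2.0 * G + 0.1 * (shared + 0.3 * noise[0]),
    0.5 + 1.5 * G + 0.1 * (shared + 0.3 * noise[1]),
    0.2 + 1.0 * G + 0.1 * noise[2],
    0.8 + 2.5 * G + 0.1 * noise[3],
])
P, eps = partial_correlation(y, G)
```

In this toy panel the raw correlations are all dominated by the trend, whereas `P` keeps a large entry only for the pair that shares the extra shock, which is the kind of residual structure the partial correlation is meant to expose.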

Decomposition of correlation matrix 

With the complete set of eigenvalues and eigenvectors, the correlation matrix $C$ can be
expressed as follows:

$$C = \sum_{i=1}^{N} v_i \lambda_i v_i'. \tag{6}$$

Then, we can decompose the correlation matrix into three parts as [26,27]

$$C = C_m + C_g + C_r = v_1 \lambda_1 v_1' + \sum_{i=2}^{N_g} v_i \lambda_i v_i' + \sum_{j=N_g+1}^{N} v_j \lambda_j v_j', \tag{7}$$

where the first component $C_m$ represents a market mode reflecting collective behavior driven by a common
influencing force, the second component $C_g$ stands for the correlation structure associated with the bulk
eigenvalues, reflecting the partitioning of time series, and the third component $C_r$ contains the random
noise terms. Determining the market mode $C_m$ is trivial. Determining $N_g$ for $C_g$ is not
straightforward. One can use the eigenvalues that deviate from the prediction of RMT or estimate $N_g$
through an econometric method [28]. Because we have only 10 time series, we do not distinguish $C_g$ and
$C_r$ and adopt the following simple decomposition:

$$C = C_m + C_b, \tag{8}$$

where

$$C_m = v_1 \lambda_1 v_1' \quad \text{and} \quad C_b = \sum_{i=2}^{N} v_i \lambda_i v_i'. \tag{9}$$

Note that, when $N$ is large, it is necessary to further extract the noise part $C_r$.
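The two-part decomposition into a market mode and a residual part can be sketched directly from the eigendecomposition. The $3 \times 3$ matrix below is a toy input, and the function name `market_residual_split` is an assumption for illustration.

```python
import numpy as np

def market_residual_split(C):
    """Split C into the market mode C_m and the residual part C_b of Eqs. (8) and (9)."""
    lam, V = np.linalg.eigh(np.asarray(C, dtype=float))   # ascending eigenvalues
    v1 = V[:, -1]                                         # eigenvector of the largest eigenvalue
    C_m = lam[-1] * np.outer(v1, v1)
    C_b = (V[:, :-1] * lam[:-1]) @ V[:, :-1].T            # sum over the remaining eigenpairs
    return C_m, C_b

C = np.array([[1.0, 0.9, 0.8],
              [0.9, 1.0, 0.7],
              [0.8, 0.7, 1.0]])
C_m, C_b = market_residual_split(C)
```

By construction `C_m + C_b` reproduces `C` and `C_m` has rank one, so any structure left in `C_b` is what remains after the common mode is removed.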


The log t test

The log t test proposed by Phillips and Sul is based on a nonlinear time-varying factor model and provides
a framework for modeling the transitional dynamics as well as long-run behaviors [29]. We can represent the
time series $y_i(t)$ with a time-varying common factor:

$$y_i(t) = \delta_i(t) \mu(t), \tag{10}$$

where $\mu(t)$ is a single common component and $\delta_i(t)$ is a time-varying idiosyncratic element which
captures the deviation of city $i$ from the common path defined by $\mu(t)$. Following the previous work
[29], we eliminate the cyclical components by applying the HP filter [30] and extract the trend components
$y_{\mathrm{hp},i}(t)$ of $y_i(t)$ as the series to be analyzed. Within this framework, all $N$ time series
will converge, at some point in the future, to the steady state if
$\lim_{k \to \infty} \delta_i(t + k) = \delta$ for all $i = 1, 2, \ldots, N$, irrespective of whether the
series are near the steady state or in transition. This is important given that the paths to the steady
state can differ significantly across the time series. Since $\delta_i(t)$ cannot be directly estimated from
Eq. (10), Phillips and Sul eliminate the common component $\mu(t)$ through rescaling by the panel average [29]:

$$h_i(t) = \frac{y_i(t)}{\frac{1}{N} \sum_{j=1}^{N} y_j(t)} = \frac{\delta_i(t)}{\frac{1}{N} \sum_{j=1}^{N} \delta_j(t)}. \tag{11}$$

The relative transition measure $h_i(t)$ captures the transition path with respect to the panel average,
which is analogous to the differential $D_i(t)$ in Eq. (16). In order to define a formal econometric test of
convergence as well as an empirical algorithm for defining club convergence, the following semi-parametric
form for the time-varying coefficients $\delta_i(t)$ is assumed:


$$\delta_i(t) = \delta_i + \sigma_i(t) \xi_i(t), \tag{12}$$

where $\sigma_i(t) = \sigma_i / [L(t) t^{\alpha}]$, $\sigma_i > 0$, $t > 0$, and $\xi_i(t)$ is weakly
dependent over $t$ but is iid(0, 1) over $i$. The function $L(t)$ is a slowly varying function, increasing
and divergent at infinity ($L(t) = \ln t$ in the present report). Under this specific form for
$\delta_i(t)$, the null hypothesis $\mathcal{H}_0$ of convergence for all $i$ and the alternative hypothesis
$\mathcal{H}_1$ of non-convergence for some $i$ are expressed as follows:

$$\begin{cases} \mathcal{H}_0: \delta_i = \delta \ \text{and} \ \alpha \geq 0 \\ \mathcal{H}_1: \delta_i \neq \delta \ \text{or} \ \alpha < 0 \end{cases} \tag{13}$$

Phillips and Sul demonstrate that the null of convergence can be tested in the framework of the following
regression [29]:

$$\ln(H_1 / H_t) - 2 \ln L(t) = c + b \ln t + u_t \tag{14}$$

for $t = [rT], [rT] + 1, \ldots, T$, where $0 < r < 1$ (we use $r = 0.3$ in the present work as recommended
in Ref. [29]), $T$ is the length of the initial time series, and $[rT]$ represents the integer part of $rT$.
In this regression, $H_t = \frac{1}{N} \sum_{i=1}^{N} [h_i(t) - 1]^2$ and $b = 2\hat{\alpha}$, where $h_i(t)$
is the relative transition path in Eq. (11) and $\hat{\alpha}$ is the least squares estimate of $\alpha$.
The null hypothesis of convergence can be tested by applying a conventional one-sided t-test for the slope
coefficient $b \geq 0$. For example, if the point estimate of $b$ is significantly less than zero, the null
hypothesis of convergence is rejected. Specifically, at the 5% significance level, the null hypothesis of
convergence is rejected if the t-statistic of $b$ is less than $-1.65$, that is, $t_b < -1.65$.
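A compact sketch of the log t regression may help make the procedure concrete. It is an illustration under stated assumptions rather than the implementation used in this work: plain OLS standard errors replace the heteroskedasticity and autocorrelation consistent estimator of Ref. [29], no HP filtering is applied, and the two panels are synthetic (one with gaps shrinking like $1/t$, one with gaps widening like $t$). The function name `logt_regression` is hypothetical.

```python
import numpy as np

def logt_regression(y, r=0.3):
    """Slope b and its t-statistic for the log t regression of Eq. (14).

    y has shape (N, T). Assumption: plain OLS standard errors are used for
    brevity, which is a simplification of the estimator in Ref. [29].
    """
    y = np.asarray(y, dtype=float)
    N, T = y.shape
    h = y / y.mean(axis=0)                      # relative transition paths h_i(t)
    H = ((h - 1.0) ** 2).mean(axis=0)           # cross-sectional variance H_t
    start = max(int(r * T), 2)                  # [rT]; t >= 2 keeps ln(ln t) finite
    t = np.arange(start, T + 1)                 # t = [rT], ..., T
    lhs = np.log(H[0] / H[start - 1:]) - 2.0 * np.log(np.log(t))
    X = np.column_stack([np.ones(t.size), np.log(t)])
    coef, *_ = np.linalg.lstsq(X, lhs, rcond=None)
    resid = lhs - X @ coef
    s2 = resid @ resid / (t.size - 2)           # residual variance
    se_b = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    return coef[1], coef[1] / se_b

T = 101
t_all = np.arange(1, T + 1)
mu = np.linspace(1000.0, 2000.0, T)                    # hypothetical common trend
c = np.array([-0.2, -0.1, 0.0, 0.1, 0.2])[:, None]     # mean-zero city effects
y_conv = mu * (1.0 + c / t_all)                        # gaps shrink like 1/t
y_div = mu * (1.0 + c * t_all / T)                     # gaps widen like t

b_conv, t_conv = logt_regression(y_conv)
b_div, t_div = logt_regression(y_div)
```

For the shrinking-gap panel the slope is positive and convergence is not rejected, while for the widening-gap panel the t-statistic falls far below $-1.65$ and convergence is rejected.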

However, the rejection of full convergence does not imply the absence of convergence in subgroups 
of the panel. Phillips and Sul propose the following clustering algorithm to find a core convergence 
subgroup [29]: 

1. Order the cities in the panel according to the last observation.

2. Find core cities in the panel by running the log t regression for the $k$ highest cities with
$2 \leq k < N$, and calculate the convergence t-statistic $t_k$. The size of the core group is chosen on the
basis of the maximum $t_k$ with $t_k > -1.65$.

3. Add one city at a time to the $k$ core members (step 2) and perform the log t test. A city is added to the
club if the resulting t-statistic is greater than zero; the core members together with the cities added in
this way constitute the first convergence club.

4. Run a log t regression for the remaining cities in the panel and check if the convergence criterion is
met. If this group satisfies the convergence test, then these members form a second convergence club.
Otherwise, repeat steps 1 to 3 to see if the remaining set can be further subdivided into convergence
clusters. If no core group can be formed, then these cities exhibit divergent behavior.


Results 

Raw correlation matrix of the raw HPI series 

We study the correlation matrix $C_y$, whose element $C_{ij}$ is the Pearson correlation coefficient between
the HPI series $y_i(t)$ and $y_j(t)$ of cities $i$ and $j$. We also investigate the correlation matrix
$C_{y_{\mathrm{hp}}}$ of the trend components $y_{\mathrm{hp},i}(t)$, which are obtained by eliminating the
cyclical component from $y_i(t)$ using the HP filter [30]. Figure 2 illustrates the correlation matrices
$C_y$ and $C_{y_{\mathrm{hp}}}$ along with their corresponding affinity matrices $A_y$ and
$A_{y_{\mathrm{hp}}}$ obtained by the box clustering method [20]. Although the box clustering method
produces some block clusters in the affinity matrices, the extremely high correlation coefficients in the
correlation matrices make the clustering results unconvincing. Nevertheless, we can still observe two
clusters of cities in Fig. 2b, in which Beijing, Guangzhou, Shenzhen and Shanghai
are widely recognized as the first-tier cities in the Chinese housing market.
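The trend extraction by the HP filter can be sketched as a single linear solve. The smoothing parameter is not stated in the text; the value 14400 used below is a common convention for monthly data and is an assumption, as are the function name `hp_trend` and the synthetic input series.

```python
import numpy as np

def hp_trend(y, lam=14400.0):
    """Hodrick-Prescott trend: minimizes sum (y - tau)^2 + lam * sum (second difference of tau)^2."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    D = np.zeros((T - 2, T))                    # second-difference operator
    for k in range(T - 2):
        D[k, k:k + 3] = [1.0, -2.0, 1.0]
    return np.linalg.solve(np.eye(T) + lam * D.T @ D, y)

# Synthetic series: a linear trend plus a 12-month cycle (illustrative only).
T = 101
t = np.arange(T)
line = 1.0 + 0.01 * t
y = line + 0.05 * np.sin(2 * np.pi * t / 12)
trend = hp_trend(y)
cycle = y - trend                               # cyclical component removed from y
```

A purely linear series passes through the filter unchanged, because its second differences vanish, so the filter only removes fluctuations around a smooth path.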

However, the eigenvalues and eigenvectors of $C_y$ afford us some important information. Observing
$\lambda_1$ and $v_1$ in Table 1, we find that $v_1$ contains practically identical components with the same
sign (positive here) and that the contribution of $\lambda_1$ reaches an extremely high level with
$\varphi_1 = 96.5\%$. According to random matrix theory [16], we conclude that there is a strong collective
force driving the rise of these HPI series. The largest eigenvalue $\lambda_1$ and its eigenvector $v_1$
adequately quantify the qualitative notion of the collective response of the entire system to stimuli.
However, the partitioning effect of other large eigenvalues observed in stock markets is not evident for the
10 key cities.

Partial correlation of the raw HPI series 

According to Table 1, the projection on $v_1$ contains 96.5% of the total variability of the 10 cities’
HPIs. Thus we compute the eigenportfolio associated with $\lambda_1$ as follows:

$$G(t) = u_1 y(t), \tag{15}$$
Figure 2. Box clustering analysis of correlation matrices. (a) Correlation matrix of the 10
cities’ initial HPI series $y_i(t)$ ordered according to the box clustering method. (b) Corresponding
affinity matrix of the raw matrix. (c) Correlation matrix of $y_{\mathrm{hp},i}(t)$ ordered by the box
clustering method, where $y_{\mathrm{hp},i}(t)$ is the trend component of $y_i(t)$ after eliminating the
cyclical components by way of the HP filter. (d) Corresponding affinity matrix of $y_{\mathrm{hp},i}(t)$.

Table 1. Eigenvalues and eigenvectors of the raw correlation matrix. $\varphi_i$ is the percent of
variability explained by the corresponding $\lambda_i$. $\Phi_i$ is the cumulative percent of variability
explained by $\lambda_1, \lambda_2, \ldots, \lambda_i$.

| | $v_1$ | $v_2$ | $v_3$ | $v_4$ | $v_5$ | $v_6$ | $v_7$ | $v_8$ | $v_9$ | $v_{10}$ |
|---|---|---|---|---|---|---|---|---|---|---|
| Chengdu | 0.312 | 0.318 | -0.609 | -0.005 | -0.549 | 0.206 | 0.146 | -0.18 | 0.155 | 0.1 |
| Chongqing | 0.315 | 0.285 | -0.181 | -0.481 | 0.661 | 0.254 | 0.175 | 0.111 | 0.073 | -0.074 |
| Hangzhou | 0.314 | 0.538 | 0.231 | -0.15 | -0.201 | -0.473 | -0.376 | 0.312 | -0.119 | -0.137 |
| Nanjing | 0.315 | 0.208 | 0.304 | 0.674 | 0.143 | 0.289 | 0.03 | 0.152 | 0.428 | 0.001 |
| Tianjin | 0.313 | -0.305 | 0.479 | -0.381 | -0.403 | 0.202 | 0.417 | 0.22 | 0.067 | -0.062 |
| Wuhan | 0.32 | -0.017 | 0.21 | 0.017 | -0.002 | 0.46 | -0.398 | -0.345 | -0.563 | 0.214 |
| Shanghai | 0.32 | 0.003 | 0.162 | 0.073 | 0.121 | -0.434 | 0.309 | -0.716 | 0.02 | -0.232 |
| Guangzhou | 0.317 | -0.376 | -0.342 | 0.255 | 0.034 | -0.003 | -0.043 | 0.274 | -0.31 | -0.634 |
| Shenzhen | 0.317 | -0.468 | -0.11 | -0.199 | 0.045 | -0.136 | -0.542 | -0.086 | 0.531 | 0.158 |
| Beijing | 0.32 | -0.176 | -0.149 | 0.191 | 0.142 | -0.355 | 0.285 | 0.269 | -0.269 | 0.66 |
| $\lambda_i$ | 9.65 | 0.113 | 0.104 | 0.058 | 0.036 | 0.018 | 0.012 | 0.005 | 0.002 | 0.001 |
| $\varphi_i$ | 96.5% | 1.13% | 1% | 0.58% | 0.36% | 0.18% | 0.12% | 0.05% | 0.02% | 0.01% |
| $\Phi_i$ | 96.5% | 97.63% | 98.67% | 99.25% | 99.62% | 99.8% | 99.92% | 99.97% | 99.99% | 100% |


where $G(t)$ can be treated as the benchmark of the collective rising trend, $u_1$ is a $1 \times 10$ vector
whose components are the squared components of $v_1$, and $y(t)$ is a $10 \times 101$ matrix which contains
the original HPI series of the 10 cities. Squaring the components of the unit-length eigenvector $v_1$
ensures that the components of $u_1$ sum to 1, so that $G(t)$ has the same magnitude as the initial HPI series.
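The construction of the eigenportfolio can be sketched as follows. The panel is synthetic and only stands in for the $10 \times 101$ HPI matrix; the function name `eigenportfolio` is hypothetical.

```python
import numpy as np

def eigenportfolio(y):
    """Weights u_1 (squared components of v_1) and collective trend G(t) = u_1 y(t)."""
    y = np.asarray(y, dtype=float)              # shape (N, T)
    lam, V = np.linalg.eigh(np.corrcoef(y))     # ascending eigenvalues
    v1 = V[:, -1]                               # unit-length eigenvector of lambda_1
    u1 = v1 ** 2                                # squared components sum to 1
    return u1, u1 @ y

# Synthetic stand-in for the 10 x 101 HPI matrix.
rng = np.random.default_rng(2)
trend = np.linspace(1000.0, 2000.0, 101)
scale = np.linspace(0.8, 1.2, 10)[:, None]
y = scale * trend + rng.normal(scale=20.0, size=(10, 101))
u1, G = eigenportfolio(y)
```

Since the weights are non-negative and sum to 1, `G` is a weighted average of the city series at every date and therefore always lies between the lowest and the highest index.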

With the eigenportfolio $G(t)$ acting as the collective trend, we can calculate the partial correlation
matrix $P_y$, whose elements $P_{ij}$ are the partial correlation coefficients of cities $i$ and $j$.
Figure 3 shows $P_y$ and its corresponding affinity matrix $A_{P_y}$. One can observe that the correlations
between the $y_i(t)$ weaken after eliminating the impact of $G(t)$, in the sense that the elements of $P_y$
are significantly smaller than those of $C_y$.

According to the affinity matrix $A_{P_y}$ in Fig. 3b, we observe three clusters of cities: Club 1:
Chongqing (CQ) and Chengdu (CD); Club 2: Beijing (BJ), Guangzhou (GZ) and Shenzhen (SZ); and Club 3: Shanghai
(SH), Nanjing (NJ), Wuhan (WH), Hangzhou (HZ) and Tianjin (TJ). The residual series $\varepsilon_i(t)$ and
their corresponding clubs are shown in Fig. 3c. It is evident that the trajectories in the same club have
a similar pattern, while the paths in different clubs exhibit different shapes.

Decomposition of correlation matrix of the raw HPI series

We decompose the raw correlation matrix $C_y$ into the market effect part $C_m$ and the residual
part $C_b$, as illustrated in Fig. 4a and b. We can see that the elements of $C_b$ are much smaller than the
corresponding elements of $C_m$. This suggests that the majority of the extremely large correlation
coefficients of $C_y$ come from the market-wide collective trend. For the raw HPI series, little
information is left after removing the strong collective trend.

Figure 4c shows the affinity matrix $A_b$ obtained by implementing the box clustering method on the residual
matrix $C_b$. The cluster in the center of $A_b$, which contains SZ, GZ and BJ, is the most evident. At the
northwest corner of $A_b$, we see another cluster containing NJ, WH, SH and HZ. The cluster for CQ, CD
and TJ is not clear. Therefore, the decomposed components of the raw matrix are also able to identify
city clusters with similar evolution of the house price indexes.
Figure 3. Partial correlation analysis. (a) The partial correlation matrix of the initial HPI series. (b)
The corresponding affinity matrix obtained by the box clustering method. (c) Residual series
$\varepsilon_i(t)$ obtained by regressing $y_i(t)$ on the collective trend $G(t)$. Varying colors and marks
represent block clubs obtained by the box clustering method.

Figure 4. Decomposition of the raw correlation matrix. (a) Matrix $C_m$ reflecting the market
effect. (b) Residual matrix $C_b$. (c) Affinity matrix $A_b$ of the residual matrix $C_b$. Axis order in the
panels: NJ, WH, SH, HZ, SZ, BJ, GZ, CD, CQ, TJ.


Analysis of differentials between HPI series $y_i(t)$ and the collective trend

We now turn to the behavior of the HPIs relative to their collective trend. The deviation
from the collective trend can be quantified by the differential between the regional HPI $y_i(t)$ and the
collective trend benchmark $G(t)$:

$$D_i(t) = y_i(t) - G(t). \tag{16}$$

In Fig. 5a, we show the deviation paths $D_i(t)$ of the ten cities. One can intuitively observe that the ten
paths fall into two groups according to their trends: rising and falling. The rising group
contains Shenzhen, Beijing and Guangzhou, which show higher-than-average house price growth.

Figure 5b shows the cross-correlation matrix $C_D$ of the deviation paths. There are two obvious blocks
with positive correlations within the blocks and negative correlations between the blocks, consistent with
the opposite trends in the differentials in Fig. 5a. The collective effect has been successfully removed. We
also apply the box clustering method [20] to the correlation matrix $C_D$. By sorting the cities according
to the order of the affinity matrix $A_D$, two significant clubs are visualized in Fig. 5c. Club 1 includes
three cities, Shenzhen, Guangzhou and Beijing, and Club 2 includes seven cities, Shanghai, Chengdu, Nanjing,
Tianjin, Wuhan, Hangzhou and Chongqing.

The eigenvectors and eigenvalues of the differential matrix $C_D$ are presented in Table 2. We find that
the components of the largest eigenvector $v_1$ have positive and negative signs, corresponding respectively
to the two clubs identified in Fig. 5. It is not surprising that the largest eigenvector no longer reflects
the market effect but includes some grouping information, which is reminiscent of the US housing
market [12]. The nine remaining eigenvectors do not carry much economic information. Note that $\lambda_1$
is much larger than the other $\lambda_i$’s, and the large value of $\varphi_1$ also indicates the high
systemic risk in the Chinese housing market.

Figure 5. Analyzing the differentials between HPI series $y_i(t)$ and the collective trend. (a)
Evolution of the differentials between the HPI time series and the collective trend. The three red
dashed lines correspond to Beijing, Shenzhen and Guangzhou in Club 1, while the blue dashed lines
correspond to the seven remaining cities in Club 2. The two continuous lines decorated with solid circles
highlight the individual collective trends $G_{\mathrm{Club1}}(t)$ and $G_{\mathrm{Club2}}(t)$ of the two
clubs respectively. (b) The correlation matrix $C_D$ determined by the deviation paths of the 10 cities.
(c) Corresponding affinity matrix $A_D$ obtained by the box clustering method.

Table 2. Eigenvectors and eigenvalues of the differential matrix $C_D$. The eigenvalues $\lambda_i$ and
the corresponding eigenvectors $v_i$ of the correlation matrix $C_D$ of deviation paths $D_i(t)$,
$i = 1, 2, \ldots, 10$. $\varphi_i$ is the percent of variability explained by the corresponding
$\lambda_i$. $\Phi_i$ is the cumulative percent of variability explained by
$\lambda_1, \lambda_2, \ldots, \lambda_i$.

| | $v_1$ | $v_2$ | $v_3$ | $v_4$ | $v_5$ | $v_6$ | $v_7$ | $v_8$ | $v_9$ | $v_{10}$ |
|---|---|---|---|---|---|---|---|---|---|---|
| Shenzhen | 0.346 | 0.149 | 0.139 | 0.046 | -0.056 | -0.278 | 0.463 | -0.077 | -0.338 | 0.649 |
| Guangzhou | 0.227 | -0.732 | -0.082 | -0.421 | -0.069 | 0.319 | 0.241 | 0.238 | 0.009 | 0.087 |
| Beijing | 0.343 | 0.172 | -0.169 | -0.036 | 0.143 | 0.274 | -0.504 | 0.051 | 0.429 | 0.536 |
| Shanghai | -0.271 | 0.416 | -0.488 | -0.628 | -0.308 | 0.005 | 0.15 | -0.012 | 0 | 0.065 |
| Chengdu | -0.332 | -0.307 | -0.034 | 0.129 | -0.317 | 0.205 | -0.152 | -0.724 | -0.085 | 0.284 |
| Nanjing | -0.332 | -0.113 | -0.216 | -0.158 | 0.8 | -0.049 | -0.085 | -0.052 | -0.346 | 0.183 |
| Tianjin | -0.273 | 0.203 | 0.783 | -0.369 | 0.027 | 0.326 | -0.056 | 0.084 | -0.039 | 0.129 |
| Wuhan | -0.345 | -0.162 | 0.102 | -0.011 | 0.142 | -0.382 | 0.33 | -0.031 | 0.73 | 0.19 |
| Hangzhou | -0.331 | 0.136 | -0.19 | 0.49 | -0.013 | 0.559 | 0.371 | 0.342 | 0.006 | 0.169 |
| Chongqing | -0.34 | -0.216 | 0 | 0.088 | -0.339 | -0.369 | -0.415 | 0.531 | -0.2 | 0.291 |
| $\lambda_i$ | 8.016 | 0.984 | 0.521 | 0.273 | 0.112 | 0.054 | 0.023 | 0.011 | 0.006 | 0 |
| $\varphi_i$ | 80.16% | 9.84% | 5.21% | 2.73% | 1.12% | 0.54% | 0.23% | 0.11% | 0.06% | 0% |
| $\Phi_i$ | 80.16% | 90% | 95.21% | 97.94% | 99.06% | 99.6% | 99.83% | 99.94% | 100% | 100% |
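
The vanishing smallest eigenvalue $\lambda_{10}$ in Table 2 can be understood from the construction of the differentials. The short derivation below is a sketch that assumes, as stated for Eq. (15), that the weights $u_{1,i}$ (the components of $u_1$) sum to 1; the symbols $\Sigma_D$ and $S$ are introduced here only for illustration.

$$\sum_{i} u_{1,i} D_i(t) = \sum_{i} u_{1,i} y_i(t) - G(t) \sum_{i} u_{1,i} = G(t) - G(t) = 0.$$

Hence the ten deviation paths are linearly dependent. Writing $\Sigma_D$ for their covariance matrix and $S = \mathrm{diag}(\sigma_{D_1}, \ldots, \sigma_{D_{10}})$ for their standard deviations, we have $\Sigma_D u_1' = 0$ and therefore $C_D (S u_1') = S^{-1} \Sigma_D S^{-1} S u_1' = 0$. The correlation matrix $C_D$ is thus singular, with a null eigenvector proportional to $(u_{1,i} \sigma_{D_i})$ whose components are all positive, which seems consistent with the last column of Table 2.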


We further study the two correlation matrices of the differentials in Club 1 and Club 2, resulting in
a $3 \times 3$ matrix for Club 1 and a $7 \times 7$ matrix for Club 2. The eigenvalues and eigenvectors are
listed in Table 3. It is found that the components of the two eigenvectors $v_1$ associated with the two
largest eigenvalues $\lambda_1$ have the same signs and relatively similar magnitudes. Hence, these
eigenvectors indicate the presence of a collective behavior within the two subsystems [12]. It is
interesting to notice that the relative magnitudes of the $v_1$ components for the two clubs are similar to
those for the whole matrix in Table 2. For instance, the results show that
$v_{1,\mathrm{Shenzhen}} = 0.622$, $v_{1,\mathrm{Guangzhou}} = 0.474$ and $v_{1,\mathrm{Beijing}} = 0.623$
for Club 1 (Table 3) and $v_{1,\mathrm{Shenzhen}} = 0.346$, $v_{1,\mathrm{Guangzhou}} = 0.227$ and
$v_{1,\mathrm{Beijing}} = 0.343$ for the whole matrix (Table 2). In both cases, we have
$v_{1,\mathrm{Shenzhen}} : v_{1,\mathrm{Guangzhou}} : v_{1,\mathrm{Beijing}} \approx 3 : 2 : 3$. This implies
that Shenzhen and Beijing dominate the collective behavior in Club 1, while the contribution of Guangzhou is
relatively smaller. For Club 2, Shanghai and Tianjin have relatively small $v_1$ components, while the other
components are close to each other. We determine the eigenportfolios of both clubs according to Eq. (15) to
extract their common trends $G_{\mathrm{Club1}}(t)$ and $G_{\mathrm{Club2}}(t)$, which are shown in Fig. 5a.
Obviously, $G_{\mathrm{Club1}}(t)$ has an increasing trend and $G_{\mathrm{Club2}}(t)$ has a decreasing
trend. This does not mean that the house prices in Club 1 rise while the house prices in Club 2 fall.
Instead, it means that the house prices in Club 1 grow faster than average while the house prices in Club 2
grow slower than average. Different from the case of the UK housing market [31], the eigenportfolios of the
two clubs are non-stationary and present remarkable trends over time $t$.

The relative small percents of ipi = 78% for Clubi and ipi = 81.6% for Club 2 implies that, there are 
still remarkable portions of variabilities hidden in the rest of eigenvalues and eigenvectors after extract¬ 
ing the common trend of Gciubi(i) and Gciub 2 (f)- Scrutinizing the contents of eigenvectors of Clubi, 
we already notice that the loading of Guangzhou on Gciubi(i) with Ui^ouangzhou = 0.474 is relative 
smaller than the other two cites with ui^shenzhen = 0.622 and ui^Beijing = 0.623, which indicates that 
Guangzhou, as a component of $C_{\mathrm{Club}_1}(t)$, does not contribute to the club's collective tendency as
significantly as Shenzhen and Beijing do. Conversely, the loading $u_{2,\mathrm{Guangzhou}} = -0.88$ gives $v_2$ a
large inverse participation ratio (IPR), a quantity often applied in localization theory, suggesting
that $v_2$ is localized due to the significant contribution of Guangzhou to it. Therefore, eigenvector
$v_2$ carries information about the heterogeneity of Guangzhou in Club 1. The eigenvector $v_3$ also
contains significant participation contents, namely the loadings $u_{3,\mathrm{Shenzhen}} = 0.706$ and
$u_{3,\mathrm{Beijing}} = -0.708$, which have opposite signs. This pair of components in eigenvector $v_3$, associated
with the smallest eigenvalue $\lambda_3$, highlights a considerable linear relationship between the two
participants Shenzhen and Beijing, which have the largest correlation coefficient
$C_{\mathrm{Shenzhen},\mathrm{Beijing}} = 0.9509$ in Club 1 [IS]. Investigating Club 2 in the same way, we find that cities
like Shanghai, Tianjin and Hangzhou do not contribute to the collective tendency as significantly as the
other cities do, according to their relatively small loadings on $v_1$. Their heterogeneities are dispersed
over the remaining eigenvectors with significantly "large" loadings such as $u_{2,\mathrm{Shanghai}} = 0.864$,
$u_{3,\mathrm{Tianjin}} = -0.9$ and $u_{4,\mathrm{Hangzhou}} = -0.847$. In addition, one can still observe less heterogeneity
in the components $u_{5,\mathrm{Nanjing}} = 0.693$ of $v_5$ and $u_{6,\mathrm{Chengdu}} = -0.709$ of $v_6$. Similarly, the pair of
components $u_{7,\mathrm{Wuhan}} = -0.707$ and $u_{7,\mathrm{Chongqing}} = 0.602$ highlights the strong linearity between the two
cities, which have a large correlation coefficient $C_{\mathrm{Wuhan},\mathrm{Chongqing}} = 0.977$.

Table 3. The eigenvalues $\lambda_i$ and the corresponding eigenvectors $v_i$ of the cross-correlation
matrices of the deviation paths of Club 1 and Club 2, respectively. The two clubs are obtained from
the box clustering method shown in Fig. 5c. $\varphi_i$ is the percentage of variability explained by the
corresponding $\lambda_i$. $\Phi_i$ is the cumulative percentage of variability explained by
$\lambda_1, \lambda_2, \cdots, \lambda_i$.

Club 1          v1        v2        v3
Shenzhen        0.622     0.339     0.706
Guangzhou       0.474     -0.88     0.004
Beijing         0.623     0.332     -0.708
$\lambda_i$     2.34      0.611     0.049
$\varphi_i$     78%       20.4%     1.6%
$\Phi_i$        78%       98.4%     100%

Club 2          v1        v2        v3        v4        v5        v6        v7
Shanghai        0.32      0.864     -0.136    0.172     -0.307    -0.029    -0.081
Chengdu         0.396     -0.284    0.275     0.011     -0.382    -0.709    -0.198
Nanjing         0.396     0.06      0.198     0.458     0.693     -0.172    0.283
Tianjin         0.32      -0.228    -0.9      -0.049    0.048     -0.136    0.11
Wuhan           0.41      -0.214    0.031     0.194     0.059     0.495     -0.707
Hangzhou        0.388     0.154     0.154     -0.847    0.288     0.045     0.009
Chongqing       0.404     -0.217    0.178     0.06      -0.436    0.449     0.602
$\lambda_i$     5.712     0.524     0.477     0.148     0.106     0.023     0.011
$\varphi_i$     81.6%     7.5%      6.8%      2.1%      1.5%      0.3%      0.2%
$\Phi_i$        81.6%     89.1%     95.9%     98%       99.5%     99.8%     100%
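
The localization argument for Club 1 can be checked numerically from the Table 3 values. The sketch below is illustrative only: it assumes the common definition of the IPR as the sum of the fourth powers of the components of a unit-length eigenvector (the text does not state the definition it uses), and the function names are hypothetical. It also recomputes the explained and cumulative shares of variability from the eigenvalues.

```python
import numpy as np

# Eigenvectors of the Club 1 correlation matrix, copied from Table 3.
# Rows: Shenzhen, Guangzhou, Beijing; columns: v1, v2, v3.
V_CLUB1 = np.array([
    [0.622,  0.339,  0.706],
    [0.474, -0.880,  0.004],
    [0.623,  0.332, -0.708],
])
# Eigenvalues of Club 1, copied from Table 3.
LAMBDA_CLUB1 = np.array([2.34, 0.611, 0.049])


def inverse_participation_ratio(vectors):
    """IPR of each column, assumed here to be sum_k u_k^4 for a unit-length vector."""
    unit = vectors / np.linalg.norm(vectors, axis=0, keepdims=True)
    return (unit ** 4).sum(axis=0)


def explained_share(eigenvalues):
    """Share of variability per eigenvalue and the running (cumulative) share."""
    share = eigenvalues / eigenvalues.sum()
    return share, np.cumsum(share)


ipr = inverse_participation_ratio(V_CLUB1)
share, cumulative = explained_share(LAMBDA_CLUB1)
for k in range(3):
    print(f"v{k + 1}: IPR = {ipr[k]:.3f}, "
          f"share = {share[k]:.1%}, cumulative = {cumulative[k]:.1%}")
```

Under this definition the IPR ranges from 1/3 for a three-component vector with equal weights to 1 for a vector concentrated on a single city. With the Table 3 values, $v_2$ has the largest IPR of the three eigenvectors, in line with its dominant Guangzhou loading.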

log t test analysis of convergence

The blocks or clusters identified so far are obtained by different methods based on linear correlation
coefficients. However, nonlinear relationships between elements are not unusual in complex economic
systems. Therefore, we adopt an alternative econometric technique called the log t test to consolidate our
results [29]. The null hypothesis of convergence can be tested by applying a conventional one-sided t-test
for the slope coefficient $b \geq 0$. If the point estimate of $b$ is significantly less than zero, the null
hypothesis of convergence is rejected. At the 5% significance level, the critical value is $t_c = -1.65$.
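
The following is a minimal sketch of how such a test could be run. It is not the authors' implementation. It assumes the standard Phillips and Sul form of the regression, which the text does not spell out: the relative transition paths are $h_i(t) = X_i(t) / \bar{X}(t)$, their cross-sectional variance is $H_t$, and $\log(H_1/H_t) - 2\log\log t$ is regressed on $\log t$ after discarding the first fraction $r$ of the sample. The function name, the choice $r = 0.3$ and the synthetic panels are assumptions. For brevity the t-statistic uses ordinary OLS standard errors, whereas heteroskedasticity- and autocorrelation-consistent errors are the usual choice in practice.

```python
import numpy as np


def log_t_regression(X, r=0.3):
    """Return (b, t_b) for a panel X of shape (T, N) with positive entries.

    Simplified sketch: plain OLS standard errors are used instead of
    HAC standard errors.
    """
    X = np.asarray(X, dtype=float)
    T = X.shape[0]
    h = X / X.mean(axis=1, keepdims=True)       # relative transition paths h_i(t)
    H = ((h - 1.0) ** 2).mean(axis=1)           # cross-sectional variance H_t
    t = np.arange(1, T + 1)
    keep = t >= max(int(np.floor(r * T)), 2)    # drop early sample; log(log t) needs t >= 2
    y = np.log(H[0] / H[keep]) - 2.0 * np.log(np.log(t[keep]))
    Z = np.column_stack([np.ones(keep.sum()), np.log(t[keep])])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ coef
    sigma2 = resid @ resid / (len(y) - 2)       # residual variance with 2 fitted parameters
    cov = sigma2 * np.linalg.inv(Z.T @ Z)
    return coef[1], coef[1] / np.sqrt(cov[1, 1])


# Synthetic panels (hypothetical data, for illustration only).
t_col = np.arange(1, 101)[:, None]
delta = np.array([-0.4, -0.1, 0.2, 0.3])                      # deviations summing to zero
converging = 100.0 * (1.0 + delta / t_col)                    # gaps shrink like 1/t
diverging = 100.0 * (1.0 + delta * np.sqrt(t_col) / 100.0)    # gaps grow like sqrt(t)

T_CRIT = -1.65  # one-sided 5% critical value quoted in the text
for name, panel in [("converging", converging), ("diverging", diverging)]:
    b, t_b = log_t_regression(panel)
    verdict = "rejected" if t_b < T_CRIT else "not rejected"
    print(f"{name}: b = {b:.3f}, t_b = {t_b:.2f}, convergence {verdict}")
```

Applying the same function to subsets of the cities is one way to look for convergence clubs, although the clustering procedure that selects the subsets is not covered by this sketch.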

Table 4 reports the results of the log t test. The null hypothesis of overall convergence of the 10 cities'
HPI is rejected at the 5% significance level since the t-statistic $t_b = -51.25$ is far less than the critical
value $-1.65$. However, our analysis identifies four clubs: Club 1 (Beijing, Shenzhen), Club 2 (Shanghai,
Guangzhou), Club 3 (Tianjin, Hangzhou), and Club 4 (Nanjing, Wuhan, Chengdu, Chongqing). All the $b$
coefficients are positive and the t-statistics are larger than $t_c$. The identification of Club 1 with Shenzhen
and Beijing is consistent with the results from linear methods. In addition, grouping Shanghai and
Guangzhou as Club 2 meets our common perception of the Chinese housing market.


Table 4. Club convergence obtained by the log t test.

Group         Cities                                  b         t_b
Club 1        Shenzhen, Beijing                       0.038     0.60
Club 2        Shanghai, Guangzhou                     0.060     [illegible]
Club 3        Hangzhou, Tianjin                       0.47      2.61
Club 4        Nanjing, Wuhan, Chengdu, Chongqing      0.095     2.44
All Cities    all 10 cities                           -0.86     -51.25


In Fig. 6, we illustrate the relative transitional paths $h_i(t)$ of the 10 cities. Within each club, the
relative transitional paths $h_i(t)$ may evolve rather independently of each other at the early stage.
However, they exhibit a clear convergence in the most recent years.




Figure 6. The relative transitional paths $h_i(t)$ of the 10 cities. Different colors stand for
different clubs obtained by the log t test. The horizontal axis is time $t$, from 2005/7 to 2013/7.


Conclusion and discussion 

In summary, we have quantified the behavior of the HPI series of 10 key cities of China using both
linear and nonlinear approaches. An extremely strong collective trend has been detected, which drives
all the HPI series upward. Moreover, the analysis of the partial correlations and the residual
information matrix shows that the correlations between the series come mainly from this collective
trend.

The behaviors of the HPI series relative to their collective trend are also studied. The deviation paths,
which are quantified by the differences between the HPI series and their collective trend, are grouped into
two clubs, with Club 1 consisting of Shenzhen, Guangzhou and Beijing and Club 2 consisting of Shanghai,
Chengdu, Nanjing, Tianjin, Wuhan, Hangzhou and Chongqing. Members of different clubs are anti-correlated,
which corresponds to deviation paths moving in opposite directions. This suggests that the rise of the
Chinese HPI is driven by a minority of cities such as those in Club 1. Some heterogeneities within the two
clubs can be observed by investigating the eigenvalues and eigenvectors of the correlation matrix.

A recent panel convergence test, namely the log t test, has been applied to examine the convergence of the
HPI series. It reveals that the HPI series of the 10 cities do not form a single homogeneous convergence
club. Instead, our results identify four city clubs that converge to different steady states. In subsequent
studies, it would be interesting to investigate the lead-lag structure of the HPIs to pinpoint the
propagation mechanisms within the Chinese housing market.